-
Notifications
You must be signed in to change notification settings - Fork 2
Solve model.cond with custom materializer #36
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #36 +/- ##
==========================================
+ Coverage 67.67% 70.66% +2.98%
==========================================
Files 12 14 +2
Lines 495 559 +64
==========================================
+ Hits 335 395 +60
- Misses 160 164 +4
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
for more information, see https://pre-commit.ci
Finally made it! As usual with these things it was more complicated than I anticipated, but it seems to work quite robustly now. High-level description:
@const-ae, would be great if you could have a look, too. In particular, if the test cases are correct and if there's any additional (more complex?) model you think that should be tested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much @grst! This is a loooooot of work (although I have the suspicion that you enjoyed it :) )
I went through the PR but quickly noticed that it's a bit over my head at the moment. I'd need more headspace to give this a review that would deserve the name "review". I might give this another pass later, but I'm sure that the other reviews will have more useful comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble following exactly what is going on here or what is solved...I will need to look back. I've never done much with formulaic so still a little bit dazzled....
factor_storage: dict[str, list[FactorMetadata]] = defaultdict(list) | ||
variable_to_factors: dict[str, set[str]] = defaultdict(set) | ||
|
||
class CustomPandasMaterializer(PandasMaterializer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the class declaration inside of another class? This makes reasoning about what goes on a bit challenging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because each class needs to be tied to one specific factor_storage
object. The class gets instantiated by formulaic, therefore we can only pass a class, rather than an instance.
Each class is used only for one formulaic formula and stores the formula-specific metadata in the factor_storage
object that is tied to it.
I can understand that it all looks a bit daunting. I suggest to start reading at |
Co-authored-by: Ilan Gold <[email protected]>
Trying to solve #15 using a custom formulaic materializer.
~ A
~ A + B
~ A * B
~ A + C(A) + C(A, contr.treatment(base=x))
~ 0 + A
~ C(A, contr.treatment(base=x))
~ C(A, contr.sum)
~ A + continuous + np.log(continuous)